3574 results found.
Multimodal/Multimedia
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
OpenSource
Size:
11156 entries
Production Status:
Existing-used
Use:
Dialogue
- Paper title: TMT: A Transformer-based Modal Translator for Improving Multimodal Sequence Representations in Audio Visual Scene-aware Dialog
- Paper track: 10.1 Multimodal systems/Oral Presentation
- Paper status: Accept - Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Wubo Li | Audio Visual Scene-aware Dialog | /N |
Documentation:
None
Speech
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
Attribution 4.0 International
Size:
17.7 GByte
Production Status:
Existing-used
Use:
Speech enhancement
- Paper title: Towards Speech Robustness for Acoustic Scene Classification
- Paper track: 5.5 Speech and audio classification/Oral Presentation
- Paper status: Accept - Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Shuo Liu | Edinburgh Noisy Speech Database | /N |
Documentation:
None
Speech
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
From Data Center(s)
License:
LDC
Size:
None
Production Status:
Existing-used
Use:
Speech Enhancement
- Paper title: INTERSPEECH 2020 Deep Noise Suppression Challenge: A Fully Convolutional Recurrent Network (FCRN) for Joint Dereverberation and Denoising
- Paper track: 13.10 Deep Noise Suppression Challenge/Oral Presentation
- Paper status: Accept Special Session
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Maximilian Strake | CSR-I (WSJ0) Sennheiser | /N |
Documentation:
https://catalog.ldc.upenn.edu/docs/LDC93S6B/
Speech
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
From Owner
License:
FDP data transfer form
Size:
15 hours
Production Status:
Newly created-in progress
Use:
Emotion Recognition/Generation
- Paper title: The MSP-Conversation Corpus
- Paper track: 3.1 Analysis of speaker states/Oral Presentation
- Paper status: Accept - Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Carlos Busso | MSP-Conversation corpus | /N |
Documentation:
None
Speech/Written
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
From Data Center(s)
License:
LDC
Size:
6300 sentences
Production Status:
Existing-used
Use:
Speech Recognition/Understanding
- Paper title: U-net based direct-path dominance test for robust direction-of-arrival estimation
- Paper track: 5.9 Speaker spatial localization/Oral Presentation
- Paper status: Accept - Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Hao Wang | TIMIT | /N |
Documentation:
https://catalog.ldc.upenn.edu/docs/LDC93S1/
Speech
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
License:
LDC
Size:
None
Production Status:
Existing-used
Use:
Machine Learning
- Paper title: Multi-talker ASR for an unknown number of sources: Joint training of source counting, separation and ASR
- Paper track: 5.8 Source separation and computational auditory s/Oral Presentation
- Paper status: Accept - Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Thilo von Neumann | Wall Street Journal (WSJ) Corpus | /N |
Documentation:
None
Speech/Written
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
Public
Size:
2.6 GByte
Production Status:
Existing-used
Use:
Speech Synthesis
- Paper title: Attention Forcing for Speech Synthesis
- Paper track: 7.5 Towards end-to-end speech synthesis/Oral Presentation
- Paper status: Accept - Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Qingyun DOU | The LJ Speech Dataset | /N |
Documentation:
Available in English
Speech
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
From Data Center(s)
License:
LDC
Size:
622 MByte
Production Status:
Existing-used
Use:
Speech Recognition/Understanding
- Paper title: BLSTM-Driven Stream Fusion for Automatic Speech Recognition: Novel Methods and a Multi-Size Window Fusion Example
- Paper track: 8.6 Neural network training methods (including new/Poster Presentation
- Paper status: Accept - Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Timo Lohrenz | TIMIT Acoustic-Phonetic Continuous Speech Corpus | /N |
Documentation:
None
Speech
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
Available through DementiaBank
License:
Size:
1.4 GByte
Production Status:
Newly created-finished
Use:
Dementia prediction
- Paper title: Alzheimer's Dementia Recognition through Spontaneous Speech: The ADReSS Challenge
- Paper track: 13.7 Alzheimer's Dementia Recognition through Spon/Oral Presentation
- Paper status: Accept Special Session
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Saturnino Luz | ADReSS Alzheimer's Dementia Dataset | /N |
Documentation:
Documentation in English included in the distribution as README files.
Speech/Written
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
Special challenge
License:
Size:
None
Production Status:
Existing-used
Use:
Alzheimer's prediction (paralinguistics)
- Paper title: Using state of the art speaker recognition and natural language processing technologies to detect Alzheimer’s disease and assess its severity
- Paper track: 13.7 Alzheimer's Dementia Recognition through Spon/Poster Presentation
- Paper status: Accept Special Session
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Raghavendra Pappagari | Dementia Bank | /N |
Documentation:
None




